A network tomography approach for traffic monitoring in smart cities
Traffic monitoring makes many of the urban planning and management activities required by a Smart City feasible. This thesis therefore proposes a network tomography-based approach that can be applied to road networks to achieve cost-efficient, flexible, and scalable monitor deployment. Because network tomography is algebraic, the selection of monitoring intersections can be formulated with a matrix whose rows represent paths between two intersections and whose columns represent links in the road network. Because the goal of the algorithm is a cost-efficient, minimum-error, high-coverage monitor set, the problem can be cast as an optimization problem over a matroid, which a greedy algorithm solves efficiently. In addition, the approach can handle noisy measurements and measurement-to-path matching. The approach achieves low error and 90% coverage with only 20% of nodes selected as monitors in a downtown San Francisco, CA topology --Abstract, page iv
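As a rough illustration of the greedy step mentioned in the abstract above, the sketch below visits candidate measurement paths in order of increasing cost and keeps a path only if its row is linearly independent of the rows already kept, which is the independence test of a linear matroid. The function names, the per-path costs, and the toy path-link matrix are assumptions made for illustration; the thesis's actual objective (cost, error, and coverage together) is not reproduced here.

```python
from fractions import Fraction


def reduce_against(row, basis):
    """Eliminate the pivot columns of `basis` from `row` using exact arithmetic."""
    row = [Fraction(x) for x in row]
    for pivot_col, basis_row in basis:
        if row[pivot_col] != 0:
            factor = row[pivot_col] / basis_row[pivot_col]
            row = [a - factor * b for a, b in zip(row, basis_row)]
    return row


def greedy_independent_paths(paths, costs):
    """Greedy selection over a linear matroid.

    paths: list of 0/1 rows, one per candidate path; columns are links.
    costs: hypothetical per-path monitoring cost (an assumption of this sketch).
    Returns the indices of a minimum-cost maximal set of independent rows.
    """
    basis = []   # list of (pivot column, reduced row)
    chosen = []
    for idx in sorted(range(len(paths)), key=lambda i: costs[i]):
        reduced = reduce_against(paths[idx], basis)
        pivot = next((j for j, x in enumerate(reduced) if x != 0), None)
        if pivot is not None:  # the path adds information not yet covered
            basis.append((pivot, reduced))
            chosen.append(idx)
    return chosen


# Toy example: three links, three candidate paths; path 2 is the sum of paths 0 and 1.
toy_paths = [[1, 1, 0], [0, 0, 1], [1, 1, 1]]
toy_costs = [3, 1, 2]
selected = greedy_independent_paths(toy_paths, toy_costs)
```

With these toy costs the two cheapest paths are kept and the most expensive one is skipped because it is a linear combination of the other two.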
Just Fine-tune Twice: Selective Differential Privacy for Large Language Models
With the increasing adoption of NLP models in real-world products, it becomes
more and more important to protect these models from privacy leakage. Because
private information in language data is sparse, previous research formalized a
Selective-Differential-Privacy (SDP) notion to provide protection for sensitive
tokens detected by policy functions, and proved its effectiveness on RNN-based
models. But the previous mechanism requires separating the private and public
model parameters and thus cannot be applied to large attention-based models. In
this paper, we propose a simple yet effective just-fine-tune-twice privacy
mechanism to first fine-tune on in-domain redacted data and then on in-domain
private data, to achieve SDP for large Transformer-based language models. We
also design explicit and contextual policy functions to provide protections at
different levels. Experiments show that our models achieve strong performance
while staying robust to the canary insertion attack. We further show that even
under low-resource settings with a small amount of in-domain data, SDP can
still improve the model utility. We will release the code, data and models to
facilitate future research
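The data preparation behind the two fine-tuning passes can be sketched as follows: a policy function flags sensitive spans, the first pass sees a redacted copy of the in-domain data, and the second pass sees the original private data. The regular expression, the mask token, and the function names below are hypothetical choices for illustration only; they are not the paper's policy functions, and the privacy mechanism applied during training is not shown.

```python
import re

# Hypothetical explicit policy: treat e-mail addresses and digit runs as sensitive.
SENSITIVE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+|\d+")
MASK = "<MASK>"  # placeholder token, an assumption of this sketch


def redact(text):
    """Replace every span flagged by the policy with the mask token."""
    return SENSITIVE.sub(MASK, text)


def build_phases(corpus):
    """Return (redacted data for the first pass, original data for the second pass)."""
    return [redact(t) for t in corpus], list(corpus)


phase_one, phase_two = build_phases(["Call 555 1234 or mail bob@example.com"])
```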
Characterization of behaviour and hazards of fire and deflagration for high-energy Li-ion cells by over-heating
Fire and deflagration are extreme manifestations of thermal runaway (TR) in Li-ion cells, and this investigation characterizes them for fully charged LiNiCoAlO2 (LNCA) 18650 cells. The cells are over-heated using a cone calorimeter under different incident heat fluxes. When the cells are exposed to an incident heat flux larger than 35 kW m−2, both fire and deflagration occur. The pressure valve opens when the cell temperature exceeds 132 °C. Fire occurs as the valve opens when the concentration of the venting vapour in the air is higher than the lower flammability limit. Deflagration happens after the cell temperature reaches about 200 °C and arises mainly from cathode decomposition, combustion of the solvents, and anode-related thermal reactions. The extreme temperatures of the cell and the flame during deflagration exceed 820 and 1035 °C, respectively. The production of COx, the mass loss, and the heat release rate (HRR) are quantified and found to increase with increasing incident heat flux. Based on a revised oxygen consumption method, the HRR and liberated heat during fire and deflagration of the cells are up to 11.8 ± 0.05 kW and 163.1 ± 1.5 kJ, respectively
Towards Efficient Data Valuation Based on the Shapley Value
"How much is my data worth?" is an increasingly common question posed by
organizations and individuals alike. An answer to this question could allow,
for instance, fairly distributing profits among multiple data contributors and
determining prospective compensation when data breaches happen. In this paper,
we study the problem of data valuation by utilizing the Shapley value, a
popular notion of value which originated in cooperative game theory. The
Shapley value defines a unique payoff scheme that satisfies many desiderata for
the notion of data value. However, the Shapley value often requires exponential
time to compute. To meet this challenge, we propose a repertoire of efficient
algorithms for approximating the Shapley value. We also demonstrate the value
of each training instance for various benchmark datasets
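To make the cost of exact computation and the idea of approximation concrete, here is a minimal sketch of the standard permutation-sampling estimator: each player's value is estimated as its average marginal contribution over random orderings. This is the textbook baseline, not necessarily one of the algorithms proposed in the paper, and the `utility` callback, the function name, and the toy weights are assumptions for illustration.

```python
import random


def monte_carlo_shapley(players, utility, num_permutations=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled permutations of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = []
        prev = utility(frozenset(coalition))
        for p in order:
            coalition.append(p)
            cur = utility(frozenset(coalition))
            totals[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    return {p: t / num_permutations for p, t in totals.items()}


# Toy additive utility: each data point contributes a fixed, hypothetical weight.
weights = {"a": 1, "b": 2, "c": 3}
estimate = monte_carlo_shapley(list(weights), lambda s: sum(weights[p] for p in s))
```

For an additive utility like this toy one, every marginal contribution equals the point's own weight, so the estimate matches the exact Shapley value; for a real model-training utility each call to `utility` would be expensive, which is why sampling-based approximations matter.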
GLoRE: Evaluating Logical Reasoning of Large Language Models
Recently, large language models (LLMs), including notable models such as
GPT-4 and burgeoning community models, have showcased significant general
language understanding abilities. However, there has been a scarcity of
attempts to assess the logical reasoning capacities of these LLMs, an essential
facet of natural language understanding. To encourage further investigation in
this area, we introduce GLoRE, a meticulously assembled General Logical
Reasoning Evaluation benchmark comprised of 12 datasets that span three
different types of tasks. Our experimental results show that, compared to
human performance and supervised fine-tuning, the logical reasoning
capabilities of open LLMs need further improvement; ChatGPT and
GPT-4 show strong logical reasoning capability, with GPT-4 surpassing
ChatGPT by a large margin. We propose a self-consistency probing method to
enhance the accuracy of ChatGPT and a fine-tuning method to boost the
performance of an open LLM. We release the datasets and evaluation programs to
facilitate future research
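The abstract does not spell out the self-consistency probing method, so the sketch below only shows the generic idea usually meant by self-consistency: sample several answers to the same question and keep the most frequent one. The function name and the answer normalization are assumptions for illustration.

```python
from collections import Counter


def self_consistent_answer(samples):
    """Return the most frequent answer among sampled outputs and its vote share."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    # Normalize option labels so that "b", "B " and "B" count as the same vote.
    counts = Counter(s.strip().upper() for s in samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)


answer, share = self_consistent_answer(["A", "b", "B ", "C", "b"])
```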